fix(#5626 #5832): handle multiple content chunks & images better #5839
Pull request overview
This PR fixes issues with handling multiple text content chunks and images in message formatting for both OpenAI and Anthropic providers. It refactors the message accumulation logic to properly collect text and image content separately, then combine them appropriately based on whether images are present.
Key changes:
- Refactors OpenAI format to use separate content_array and text_array for accumulating message content
- Enables image support in Anthropic format by calling convert_image instead of skipping images
- Adds test coverage for multiple text blocks in messages
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/goose/src/providers/formats/openai.rs | Refactors message formatting to accumulate text and image content separately, then combines them based on whether images are present; adds test for multiple text blocks |
| crates/goose/src/providers/formats/anthropic.rs | Enables image content support by importing and using convert_image utility |
  if let Some(image_path) = detect_image_path(&text.text) {
      // Try to load and convert the image
      if let Ok(image) = load_image_file(image_path) {
-         converted["content"] = json!([
-             {"type": "text", "text": text.text},
-             convert_image(&image, image_format)
-         ]);
+         content_array.push(json!({"type": "text", "text": text.text}));
+         content_array.push(convert_image(&image, image_format));
      } else {
          // If image loading fails, just use the text
-         converted["content"] = json!(text.text);
+         text_array.push(text.text.clone());
      }
  } else {
-     converted["content"] = json!(text.text);
+     text_array.push(text.text.clone());
When an image path is detected and loaded successfully, the text is added to content_array, but subsequent non-image text goes to text_array. This will cause the text_array content to be lost, because lines 243-246 use only content_array when it is non-empty. All text should go to content_array when any images are present.
is this legit feedback @alexhancock ? it sounds convincing
I think it may be slop, as the content is consolidated further down.
yeah below I do
if !content_array.is_empty() {
converted["content"] = json!(content_array);
} else if !text_array.is_empty() {
converted["content"] = json!(text_array.join("\n"));
}
which seems right
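To make the behaviour under discussion easier to check, here is a minimal sketch of that consolidation step. It assumes stand-in types in place of the serde_json values used in the PR: Part, Content and consolidate are hypothetical names introduced only for illustration, and the image is represented by a plain string rather than the output of convert_image.

```rust
// Hypothetical stand-ins for the JSON values built with json!() in the PR.
#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text(String),
    Image(String), // assumption: an image reference instead of encoded data
}

#[derive(Debug, PartialEq)]
enum Content {
    Parts(Vec<Part>), // corresponds to json!(content_array)
    Plain(String),    // corresponds to json!(text_array.join("\n"))
    Empty,            // neither buffer held anything; "content" stays unset
}

/// Mirrors the if / else-if quoted above: text_array is consulted only
/// when content_array is empty.
fn consolidate(content_array: Vec<Part>, text_array: Vec<String>) -> Content {
    if !content_array.is_empty() {
        Content::Parts(content_array)
    } else if !text_array.is_empty() {
        Content::Plain(text_array.join("\n"))
    } else {
        Content::Empty
    }
}
```

With these stand-ins, a message made only of text blocks comes out as one newline-joined string. When both buffers hold items, only the parts from content_array are returned, which is the mixed text-and-image case the review comment asks about.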
michaelneale left a comment
do we have time to add this to live test coverage? (what would prompt be to show images) - would be happy if we could do that for these 2 providers @alexhancock
b83cdd7 to 69ff62f
Co-authored-by: Pradeepta Dash <[email protected]>
69ff62f to 978aa16
if !content_array.is_empty() {
    converted["content"] = json!(content_array);
} else if !text_array.is_empty() {
    converted["content"] = json!(text_array.join("\n"));
Text content is lost when both content_array and text_array have items. If a message has text, then an image, then more text, the text in text_array is discarded because only content_array is used. Fix: merge text_array into content_array before checking, e.g., if !text_array.is_empty() { content_array.push(json!({"type": "text", "text": text_array.join("\n")})); } before line 243.
- if !content_array.is_empty() {
-     converted["content"] = json!(content_array);
- } else if !text_array.is_empty() {
-     converted["content"] = json!(text_array.join("\n"));
+ // Merge text_array into content_array if text exists
+ if !text_array.is_empty() {
+     content_array.push(json!({"type": "text", "text": text_array.join("\n")}));
+ }
+ if !content_array.is_empty() {
+     converted["content"] = json!(content_array);
@michaelneale Good call. I will think about how to get tests going, but I am merging this for now to have the bugfix in.
* main:
  docs: add DataHub MCP server extension documentation (#5769)
  docs: lowercase goose in remaining topics (#5861)
  docs: lowercase goose in getting-started and guides topics (#5857)
  Fix multi tool calling (#5855)
  fix(#5626 #5832): handle multiple content chunks & images better (#5839)
  chore: some old code hanging around, and mention configure cli (#5822)
  feat : add support for math / science symbology via katex (#5773)
  feat : add ability to see error message in toast (#5851)
…etter (block#5839) Co-authored-by: Pradeepta Dash <[email protected]>
…etter (block#5839) Co-authored-by: Pradeepta Dash <[email protected]> Signed-off-by: Sai Karthik <[email protected]>
…etter (block#5839) Co-authored-by: Pradeepta Dash <[email protected]> Signed-off-by: Blair Allan <[email protected]>
Combined fix for #5832 and #5626
See the original description of the fix in #5838. I think this combined co-authored commit should fix both issues.
cc @Pdash-exceeds @michaelneale